Change Llama2 from the Turbine implementation to the Sharktank one #2170

Draft · wants to merge 1 commit into main
Conversation

gpetters-amd
Contributor

There are still two outstanding issues I'd like some comments on, but otherwise this should be basically done.

# TODO: Convert to gguf, delete cache
Contributor Author

The way sharktank recommends generating the .gguf file is to use a CLI tool from llama.cpp. Is that still the best way to extract it, or do we have a way to do it using sharktank?
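For reference, the llama.cpp route usually means its `convert_hf_to_gguf.py` script. A rough sketch of that workflow, assuming a local llama.cpp checkout and a downloaded Hugging Face checkpoint (all paths and the output type are illustrative, not taken from this PR):

```shell
# Assumed setup: a llama.cpp checkout and a local HF checkpoint directory.
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# convert_hf_to_gguf.py reads the HF checkpoint and writes a single
# .gguf file; sharktank can then load it as a Dataset.
python llama.cpp/convert_hf_to_gguf.py \
  /path/to/Llama-2-7b-hf \
  --outfile llama2-7b-f16.gguf \
  --outtype f16
```

Whether sharktank has since grown a native exporter that removes this llama.cpp dependency is exactly the open question above.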

model = PagedLlamaModelV1(dataset.root_theta, llama_config)

fxb = FxProgramsBuilder(model)
self.torch_ir = export(fxb)
Contributor Author

Not sure why, but this is producing an empty module. Any idea what I'm missing?
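One likely cause, going by iree-turbine's `FxProgramsBuilder` API: `export(fxb)` only emits the programs that were registered on the builder, so calling it on a bare builder with no `@fxb.export_program` entry points produces an empty MLIR module. A sketch of the registration step, where the entry-point name, `example_tokens`, and the call signature are illustrative assumptions rather than the model's real interface:

```python
import torch
from iree.turbine.aot import FxProgramsBuilder, export

model = PagedLlamaModelV1(dataset.root_theta, llama_config)
fxb = FxProgramsBuilder(model)

# Hypothetical example input; the real model takes more arguments
# (attention state, cache, etc.).
example_tokens = torch.zeros(1, 16, dtype=torch.int64)

# Registering an entry point is what gives export(fxb) something to emit.
@fxb.export_program(name="prefill", args=(example_tokens,))
def _(model, tokens):
    return model(tokens)

self.torch_ir = export(fxb)  # module now contains the registered program
```

If no `export_program`-decorated functions exist between constructing `fxb` and calling `export(fxb)`, an empty module is the expected result.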
